Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce Snapshot Isolation OCC to DBTransaction #18

Closed
wants to merge 1 commit into from

Conversation

CMCDragonkai
Copy link
Member

@CMCDragonkai CMCDragonkai commented May 3, 2022

Description

This implements the snapshot isolation DB transaction.

This means DBTransaction will be automatically snapshot isolated, which means most locking will be unnecessary.

Instead when performing a transaction, there's a chance for a ErrorDBTransactionConflict exception which means there was a write conflict with another transaction.

Users can then decide on their discretion to retry the operation if they need to (assuming any non-DB side-effects are idempotent, noops or can be compensated). This should reduce the amount of locking overhead we need to do in Polykey. We may bubble up the conflict exception to the user, so the user can re-run their command, or in some cases, in-code we will automatically perform a retry. The user in this case can be the PK client, or the another PK agent or the PK GUI.

There is still one situation where user/application locks are needed, and that's where there may be a write-skew. See snapshot isolation https://en.wikipedia.org/wiki/Snapshot_isolation for more details and also https://www.cockroachlabs.com/blog/what-write-skew-looks-like/.

In the future we may upgrade to SSI (serializable snapshot isolation) which will eliminate this write-skew possibility.

Additionally this PR will also enable the keyAsBuffer and valueAsBuffer options on the iterators, enabling easier usage of the iterators without having to use dbUtils.deserialize<T>(value) where it can be configured ahead of time.

Issues Fixed

Tasks

  • 1. Added additional overloads to iterator for keyAsBuffer and valueAsBuffer
  • 2. Added transaction snapshot iterator and tie iterator lifecycle with the DBTransaction lifecycle
  • 3. Add snapshotLock to ensure mutual exclusion when using the snapshot iterator
  • 4. Implemented getSnapshot as the last-resort getter for DBTransaction.get
  • 5. Change DBTransaction.iterator to use the snapshot iterator
  • 6. Upgraded dependencies to match TypeScript-Demo-Lib (excluding node v16 upgrade)
  • 7. Add tests for keyAsBuffer and valueAsBuffer usage on the iterator, expect string and V types. Existing tests should still work. Since by default iterator returns buffers. This is different from get which by default does not return raw buffers.
  • 8. Maintain the last updated timestamp for each key being updated in transactions and also in the main DB
  • 9. Detect write conflicts and throw ErrorDBTransactionConflict

Final checklist

  • Domain specific tests
  • Full tests
  • Updated inline-comment documentation
  • Lint fixed
  • Squash and rebased
  • Sanity check the final build

@CMCDragonkai CMCDragonkai self-assigned this May 3, 2022
@CMCDragonkai
Copy link
Member Author

@emmacasolin @tegefaulkes watch this PR!

@CMCDragonkai
Copy link
Member Author

Now I have an interesting test to demonstrate the snapshot's capabilities.

import { withF, ResourceRelease } from '@matrixai/resources';
import { DB } from './src';
import * as testsUtils from './tests/utils';

async function main () {

  const key = await testsUtils.generateKey(256);

  const db = await DB.createDB({
    dbPath: './tmp/db',
    crypto: {
      key,
      ops: { encrypt: testsUtils.encrypt, decrypt: testsUtils.decrypt },
    },
    fresh: true
  });

  await db.put('key', 'value');

  const t1 = withF([db.transaction()], async ([tran]) => {
    console.log('T1 START');
    await testsUtils.sleep(100);
    console.log('T1: LOADING KEY2', await tran.get('key2'));
    console.log('T1 FINISH');
  });

  // T2 starts after T1 starts
  const t2 = withF([db.transaction()], async ([tran]) => {
    console.log('T2 START');
    await tran.put('key2', 'value2');
    console.log('T2 FINISH');
  });

  await t2;
  await t1;

  const t3 = withF([db.transaction()], async ([tran]) => {
    console.log('T2 START');
    console.log('T3: LOADING KEY2', await tran.get('key2'));
    console.log('T3 FINISH');
  });

  await t3;

  await db.stop();

}

main();

In master the output would be:

T1 START
T2 START
T2 FINISH
T1: LOADING KEY2 value2
T1 FINISH
T2 START
T3: LOADING KEY2 value2
T3 FINISH

With the new snapshot iterator, the result is:

T1 START
T2 START
T2 FINISH
T1: LOADING KEY2 undefined
T1 FINISH
T2 START
T3: LOADING KEY2 value2
T3 FINISH

With SI, because T1 and T2 are started at the same time, they get the original state of the database, and T2 commits before T1 reads the key, but because T1 still has the original snapshot, it continues to read undefined.

@CMCDragonkai
Copy link
Member Author

To allow future debugging, I've added debug logging statements for DBTransaction state transitions, creating, destroying, committing, rollbacking, finalizing.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented May 3, 2022

I just realised it's not possible to get a derived iterator for DBTransaction.iterator based on the snapshot iterator. This is simply because if the option is reverse: true, this wouldn't be applied to the snapshot iterator, therefore it couldn't be used.

That will mean, that until we can have access to raw iterators (Level/leveldown#486 (comment)), if users want to have a consistent iterators with respect to the transactional snapshot, the iterators must be created at the beginning of the transaction. Any subsequent creation would take the snapshot of the DB state at that point taking into account any changes to the DB state by other committed transactions. I'm not fully sure if this is the case because createTransaction is asynchronous. We may need to do some tests to see if is possible to intercept another asynchronous operation in between starting the transaction and creating the iterator.

We can do a synchronous constructor call instead if not.

Anyway this should mean that consistent iteration can only be done with:

withF([db.transaction()], async (tran) => {
  const i1 = tran.iterator();
  const i2 = tran.iterator();
  // do work afterwards
});

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented May 3, 2022

I've found a scenario where creating an iterator up front is still not consistent. This may mean we need to change the way we construct our transaction (to be more synchronous).

import { withF } from '@matrixai/resources';
import { DB } from './src';
import * as testsUtils from './tests/utils';

async function main () {

  const key = await testsUtils.generateKey(256);

  const db = await DB.createDB({
    dbPath: './tmp/db',
    crypto: {
      key,
      ops: { encrypt: testsUtils.encrypt, decrypt: testsUtils.decrypt },
    },
    fresh: true
  });

  await db.put('key', 'value');

  const p = db.put('key', 'value3');

  const t1 = withF([db.transaction()], async ([tran]) => {
    const i1 = tran.iterator({ keyAsBuffer: false, valueAsBuffer: false });

    console.log('CREATED ITERATOR');

    i1.seek('key')
    const v1 = (await i1.next())![1];
    await i1.end();

    const v2 = await tran.get('key');

    console.log(v1, v2, v1 === v2);
  });

  await p;
  await t1;

  await db.stop();
}

main();

Here we see that the newly created iterator gets value3, whereas the snapshot iterator gets value. This is not consistent.

@CMCDragonkai
Copy link
Member Author

There's an old fork of leveldown that did expose snapshots to the interface: https://github.com/frankboehmer/leveldown/commits/master

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented May 3, 2022

Ok by making DBTransaction construction fully synchronous, we do not have the above problem. However the above problem can still arise, as long as there is a time-difference between constructing the transaction, and performing the first non-snapshot iterator creation.

      const tran = new DBTransaction({
        db: this,
        transactionId,
        logger: this.logger,
      });

This can happen as long as you do withF([db.transaction(), someOtherResource()], async ([tran]) => { /* ... */ });. Simply because the withF will do async work in between acquiring the resource and running the transaction.

A different foolproof trick would be delay the main snapshot iterator creation until the first get or iterator call is made. This would guarantee that we would have had to be in the resource handler callback when the first snapshot iterator is created.

But it also means that snapshot iterator lifecycle is not tied directly to transaction lifecycle. It also makes snapshot creation lazy, in that if a transaction only does put or del, it does not bother creating that initial consistent snapshot. This can be a bit confusing for end-users to reason about when consistent snapshot begins.

Instead one can trigger snapshot creation as soon as get, put, del, or iterator method is called. That way, it is at least observable that only when the tran is actually interacted with does the consistent lifecycle start.

The best solution is for leveldb to expose snapshots directly and for that we need to work in C++ and fork the leveldb for js-db.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented May 3, 2022

Expecting iterators to be created up front is not going to be good for usability. Consider that in PK our transactions may be created high the call graph, and the context may be passed down to many domain methods. It is therefore important that downstream users can expect consistency when creating their iterators. Maybe it's time to start the C++ journey!

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented May 4, 2022

From https://github.com/google/leveldb/blob/main/doc/index.md#snapshots.

In C++, snapshots are used like this:

leveldb::ReadOptions options;
options.snapshot = db->GetSnapshot();
// ... apply some updates to db ...
leveldb::Iterator* iter = db->NewIterator(options);
// ... read using iter to view the state when the snapshot was created ...
delete iter;
db->ReleaseSnapshot(options.snapshot);

Here we can see that even after mutating the db, just simply creating the iterator with db->NewIterator(options) passes the snapshot in the ReadOptions struct (I'm guessing it's a struct), and ensures that the iterator reads from the same snapshot.

This basically means that read operations using get or iterator can make use of the same snapshot.

This means the final API might like this:

// Internally inside `DBTransaction`
const snapshot = db.snapshot();
const iterator = db.iterator({ snapshot }, levelPath);

// Externally at the end-user
await withF([this.db.transaction()], async ([tran]) => {
  const i1 = tran.iterator();
  const i2 = tran.iterator();
  // we can expect that `i1` and `i2` are consistent with the transactional snapshot iterator
});

The snapshot option is passed in automatically into our AbstractIteratorOptions. We may extend this type to also take the snapshot object.

@CMCDragonkai
Copy link
Member Author

CMCDragonkai commented May 4, 2022

The forked implementation of snapshots frankboehmer/leveldown@a78ddc1 done in 2018 used the NAN mechanism. This was an older way of creating native addons: https://stackoverflow.com/a/59440434/582917. The newer way uses NAPI, so we will have to study both and compare what will be needed.

@CMCDragonkai
Copy link
Member Author

The lmdb-js is an interesting solution that also has transaction support, however we know leveldb well enough that we can't pursue this alternative yet 6107d9c#r73267460.

@CMCDragonkai
Copy link
Member Author

The lmdb-js project if we switch to that, could potentially solve a number of issues:

It would be interesting to explore this in the future, since that would indicate alot of work, and we still have to compare with potentially using rocksdb instead.

Updated dev dependencies in package.json and synchronised .eslintrc

Updated typedoc generation

Canary option was never used

Updated jest.config.js to use ts-jest instead of ts-jest/utils

Enabled keyAsBuffer and valueAsBuffer options for iterators, still defaults to Buffer

WIP

Typo for ErrorDBTransactionCommited to ErrorDBTransactionCommitted

WIP keyAsBuffer and valueAsBuffer

WIP dump V types

WIP snapshot is the get of last resort for tran

WIP

WIP

WIP

WIP
@CMCDragonkai
Copy link
Member Author

@emmacasolin I had a look at this current branch after I rebased on top of master.

I see that the package-lock.json wasn't updated on the last release.

You should make sure that npm install was performed with the necessary package changes.

Also I think you removed a necessary devDependency being of @types/node-forge.

Can you create another feature branch, enter nix-shell, add back @types/node-forge and do a full npm install which should update package-lock.json as well. And then merge, and do another publish to 3.3.4 (generate docs).

@CMCDragonkai
Copy link
Member Author

Closed in favour of #19. That's now the new feature branch. I'll be bringing in the changes from TypeScript-Demo-Lib there too.

@emmacasolin emmacasolin mentioned this pull request May 25, 2022
4 tasks
@CMCDragonkai CMCDragonkai deleted the control branch May 26, 2022 08:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

DBTransaction with Pessimistic Locking & Optimistic Locking and Deadlock Detection
1 participant